
## **introduction to `rmarkdown`**

### **Anna Krystalli**
##### ***Instituto de Ecología, UNAM 31 Aug. 2016***

<br>

###### <https://annakrystalli.github.io/UNAM/rmarkdown.nb.html>

###### [\@annakrystalli](https://twitter.com/annakrystalli) | annakrystalli@googlemail.com


<br>
<br>

# **introduction**

## **markdown `.md`**

#### stripped down **`html`**

<br>

-  intended to be as **easy-to-read** and **easy-to-write** as possible.
-  intended for one purpose: to be used as a **format for writing for the web.**
-  syntax is very small, corresponding only to a very small subset of HTML tags.

<br>

## **focus on communicating & disseminating**

<br>

- formatting handled automatically
- clean and legible across platforms and outputs

## **structure of a website**

![](assets/html-css-javascript.png)

## **rmarkdown `.Rmd`**

#### `rmarkdown` integrates:

–  **a documentation language (`.md`)**

with:

–  **a programming language (`R`)**

<br>

#### enables literate programming

single document to integrate data analysis with textual representations, **linking data, code, and text**

<br>

## **outputs**

<img src="assets/RMarkdownOutputFormats.png" width="400px"/>

<br>

## **it's already everywhere!**

- [github READMEs; eg rOpenSci taxize README](https://github.com/ropensci/taxize)

- [stackoverflow: eg plot coordinates on a map](http://stackoverflow.com/questions/23130604/r-plot-coordinates-on-map)
    
- [github.io websites: eg Andy South's blog](http://andysouth.github.io/blog-setup/)

<br>
<br>

***

# **`Rmarkdown` & reproducibility**

<br>

## {data-background="http://49.media.tumblr.com/7daeff1d410bf5dce4c0b30c40b44f77/tumblr_n9dy9olLJe1qav3uso2_500.gif"}

> <span style="color:black"> Computational science has led to exciting new developments:</span>

-  <span style="color:black"><b> Technology is increasing data collection throughput; data are more complex and high-dimensional</b></span>
-  <span style="color:black"><b> Existing databases can be merged to become bigger databases</b></span>
-  <span style="color:black"><b> Computing power allows more sophisticated analyses, even on "small" data</b></span>
-  <span style="color:black"><b> For every field "X" there is a "Computational X"</b></span>

<br>

## {data-background="https://raw.githubusercontent.com/BillMills/scienceXpython/gh-pages/img/debugging.gif"}

<span style="color:black"><b>Increasing computational complexity of analyses:</b></span>

> <span style="color:black">has exposed limitations in our ability to evaluate published findings.</span>


- <span style="color:black"><b>Even basic analyses difficult to describe</b></span>

- <span style="color:black"><b>Errors more easily introduced into long analysis pipelines</b></span>

- <span style="color:black"><b>Knowledge transfer is inhibited</b></span>

- <span style="color:black"><b>Results are difficult to replicate or reproduce</b></span>

- <span style="color:black"><b>Complicated analyses cannot be trusted</b></span>

<br>
<br>

## **calls for reproducibility**

<br>

>  Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.

## 

- **fully scripted analyses**
- **make code and data available**

<img src="assets/repro_workflow.png" width="500px" />

<br>

## **a reproducible workflow**

<iframe width="560" height="315" src="https://www.youtube.com/embed/s3JldKoA0zw" frameborder="0" allowfullscreen></iframe>

<br>

## **reproducibility limitations**

- top down
- downstream (post publication)
- ultimately does not address the key question: 

    > **can we trust these results?**

<br>

## **evidence based science**

evidence needs:

- **documenting**
- **linking**
- **communicating**

<br>

`rmarkdown` can integrate **tools, processes** and **outputs** into **evidence streams** that are easily shareable

> at all stages of the scientific process

<br>

## **simple tools:** 
### low hanging fruit

- begin at the start of the process
- document & interlink evidence streams
- explore and communicate!

> empower your code and data

<br>

## **Science and the web**

#### **the web is for sharing!**

<img src="assets/www.jpg" width="650px" />


###### [Executive summary of the first ever website](http://info.cern.ch/hypertext/WWW/Summary.html)

<br>

## **why sharing is important**

<img src="assets/pgls.png" width="350px" />

> <small> To help solve these problems, we make a number of suggestions including providing blog posts or videos to explain new methods in less technical terms, encouraging reproducibility and code sharing, making wiki-style pages summarising the literature on popular methods, more careful consideration and testing of whether a method is appropriate for a given question/data set, increased collaboration, and a shift from publishing purely novel methods to publishing improvements to existing methods and ways of detecting biases or testing model fit. Many of these points are applicable across methods in ecology and evolution, not just phylogenetic comparative methods.</small>

<br>

## **examples**

#### [report](file:///Users/Anna/Google%20Drive/bird%20trait%20networks/outputs/Reports/Results/CorNetwork_plots.html)

#### [code documentation](http://rpubs.com/annakrystalli/123196)

#### [method collation](file:///Users/Anna/Google%20Drive/bird%20trait%20networks/outputs/Reports/workflow%20documentation/Hierarchical%20Networks.htm)

#### [interactive documents](http://rpubs.com/annakrystalli/161760)

#### [presentations](http://rpubs.com/annakrystalli/133391)

***

<br>
<br>



# Let's go have a look

Open your first `.Rmd`!!

<img src="assets/newmd.gif" width="500px" />

***

<br>
<br>

# **Elements of .Rmd**

<br>

## **YAML header**

### outputs
<img src="assets/outputs.png" width="500px" />

## **themes**

<img src="assets/theme.png" width="400px" />

### <http://bootswatch.com/>
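
##

As a rough sketch of how the YAML header drives the output, the chunk below writes a tiny `.Rmd` file and renders it with `rmarkdown::render()`. The title, the file contents and the choice of the `flatly` theme are assumptions made for illustration, not part of the original slides.

```{r, eval=FALSE}
library(rmarkdown)

# Hypothetical minimal document: a YAML header followed by one line of markdown.
rmd_lines <- c(
  "---",
  "title: \"My first report\"",
  "output:",
  "  html_document:",
  "    theme: flatly",
  "---",
  "",
  "Some *markdown* text."
)

# Write the document to a temporary file so nothing in the project is touched.
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(rmd_lines, rmd_file)

# Rendering needs pandoc (bundled with RStudio), so check for it first.
if (pandoc_available()) {
  out_file <- render(rmd_file, quiet = TRUE)
}
```

Changing the `output:` entry in the header is what switches between output formats; the markdown body stays the same.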

***

<br>

# **md basics**

<br>

## **text**

        normal text
normal text

        *italic text*
*italic text*

        **bold text**
**bold text**

        ***bold italic text***
***bold italic text***

        superscript^2^
superscript^2^

        ~~strikethrough~~
~~strikethrough~~ 

<br>

## **headers**

![](assets/headers.png)

## **unordered lists**

![](assets/bullets.png)

## **ordered lists**

![](assets/numbered.png)

## **quotes & code**

    > this text will be quoted
   
 > **this text will be quoted**
 
    `this text will appear as code` inline

`this text will appear as code` inline


```{r}
a <- 10
```

        the value of parameter *a* is `r a`

the value of parameter *a* is `r a`
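
##

Inline code is not limited to constants. A minimal sketch, using made-up numbers rather than anything from the slides, of computing a summary in a chunk so that it can be reported in a sentence:

```{r}
# Hypothetical measurements, for illustration only.
x <- c(10, 11, 13)

# Round before reporting so the sentence stays readable.
x_mean <- round(mean(x), 1)
x_mean
```

Referring to `x_mean` in an inline `r` expression, in the same way as *a* above, would then print 11.3 in the text and update automatically if the data change.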

<br>

## **images**

        ![](https://www.rstudio.com/wp-content/uploads/2015/01/rmarkdown-cheatsheet-2-e1457627578814.png)
        
        ![](assets/cheat.png)
        
![](assets/cheat.png)

##

### **resize images**

        <img src="assets/cheat.png" width="200px" />

<img src="assets/cheat.png" width="200px" />

## **basic tables**

    Table Header  | Second Header
    ------------- | -------------
    Cell 1        | Cell 2
    Cell 3        | Cell 4 

Table Header  | Second Header
------------- | -------------
Cell 1        | Cell 2
Cell 3        | Cell 4 

[**online .md table converter**](http://www.tablesgenerator.com/markdown_tables)

<br>

## **links**

    [Download R](http://www.r-project.org/)    
    [RStudio](http://www.rstudio.com/)
    

[Download R](http://www.r-project.org/)    

[RStudio](http://www.rstudio.com/)

<br>

## **`.md` resources**

<smaller>
[official markdown documentation](http://daringfireball.net/projects/markdown/basics)

[Rmarkdown documentation](http://rmarkdown.rstudio.com/)

[Rstudio Rmarkdown cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)

    
[github.io websites: eg Andy South's blog](http://andysouth.github.io/blog-setup/)

[Reproducible Research](https://www.coursera.org/learn/reproducible-research) coursera MOOC

[Producing html documents from `.R` scripts using `knitr::spin`](http://deanattali.com/2015/03/24/knitrs-best-hidden-gem-spin/)

</smaller>

# **chunks**

##

R code chunks can be used to render R output into documents, or simply to display code for illustration

<img src="assets/markdownChunk.png" width="500px"/>


## **options**

<img src="assets/chunks.png" width="500px"/>

for more details see <http://yihui.name/knitr/>
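
##

A small sketch of how options combine in a chunk header. The dataset and figure size are arbitrary choices for illustration: the code is shown, printed text output is hidden with `results`, and the figure size is fixed.

```{r, echo=TRUE, results="hide", fig.width=4, fig.height=3}
# Built-in dataset: speed and stopping distance of cars.
speed_summary <- summary(cars$speed)
print(speed_summary)  # text output is suppressed by results="hide"

# Figures are still drawn, at the size given in the chunk header.
plot(cars$speed, cars$dist,
     xlab = "speed (mph)", ylab = "stopping distance (ft)")
```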

## **set default options**

```{r, eval=TRUE}
# Defaults for every chunk: show code, hide warnings and messages
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
```
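
##

To check which defaults are in force, `knitr::opts_chunk$get()` can be queried after setting them. This is a sketch and the variable name `defaults` is just illustrative:

```{r}
# Document-wide defaults, as set on the previous slide.
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)

# Look up several defaults at once; the result is a named list.
defaults <- knitr::opts_chunk$get(c("echo", "warning", "message"))
str(defaults)
```

An individual chunk can still override a default in its own header, for example `{r, echo=FALSE}`.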

***
<br>

# **extras**

## **`knitr::kable()` tables**

```{r, warning=FALSE, message=FALSE}
library(knitr)  # provides kable()
data(airquality)
kable(head(airquality), caption = "New York Air Quality Measurements")
```
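
##

`kable()` also takes arguments for tidying the printed table. A hedged sketch, assuming you want rounded values and column labels with units (the labels are illustrative; the units follow `?airquality`):

```{r, warning=FALSE, message=FALSE}
library(knitr)
data(airquality)

# Illustrative column labels; units follow the dataset's help page.
nice_names <- c("Ozone (ppb)", "Solar radiation (lang)", "Wind (mph)",
                "Temp (deg F)", "Month", "Day")

# digits rounds numeric columns; col.names replaces the header row.
aq_table <- kable(head(airquality), digits = 1, col.names = nice_names,
                  caption = "New York Air Quality Measurements")
aq_table
```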

<br>

## **`DT::datatable()` tables**
```{r, warning=FALSE, message=FALSE}
library(DT)  # provides datatable()
data(airquality)
datatable(airquality, caption = "New York Air Quality Measurements")
```
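
##

The interactive table can be adjusted through arguments to `datatable()`. The settings below are one possible configuration, chosen for illustration:

```{r, warning=FALSE, message=FALSE}
library(DT)
data(airquality)

# rownames = FALSE drops the row-number column;
# pageLength sets how many rows are shown per page;
# filter = "top" adds a filter box above each column.
aq_widget <- datatable(airquality, rownames = FALSE, filter = "top",
                       options = list(pageLength = 5),
                       caption = "New York Air Quality Measurements")
aq_widget
```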

<br>

## **[plotly](https://plot.ly/)**

```{r, warning=FALSE, message=FALSE, eval=FALSE}
library(plotly)

set.seed(100)
d <- diamonds[sample(nrow(diamonds), 1000), ]

p <- ggplot(data = d, aes(x = carat, y = price)) +
  geom_point(aes(text = paste("Clarity:", clarity)), size = 1) +
  geom_smooth(aes(colour = cut, fill = cut)) + facet_wrap(~ cut)

ggplotly(p)

```

## 

```{r, warning=FALSE, message=FALSE, echo=FALSE, fig.height=5, fig.width=8.3}
library(plotly)

set.seed(100)
d <- diamonds[sample(nrow(diamonds), 1000), ]

p <- ggplot(data = d, aes(x = carat, y = price)) +
  geom_point(aes(text = paste("Clarity:", clarity)), size = 1) +
  geom_smooth(aes(colour = cut, fill = cut)) + facet_wrap(~ cut)

ggplotly(p)

```

<br>

## **shiny**

<img src="assets/shiny.png" width="500px"/>

### <http://shiny.rstudio.com/>

<br>

## **rpubs**

### <http://rpubs.com/>

<img src="assets/rpubs.png" width="100px" />
<img src="assets/rpubs_ui.jpg" width="400px" />

***

<br>


# **Exercise**

## **your mission**

> *create your first `.Rmd`!*

- choose some data eg: 
    + [`datasets`](https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/00Index.html) package 
    + `data(package = .packages(all.available = TRUE))`
    + [Caroline Thomson's blue tit data](https://osf.io/n3jgy/)
- show us some data in a table
- plot some data
- write a bit about what you did
- publish it on rpubs. Add your link to our [googledoc](https://docs.google.com/document/d/18BF4ZalGCLYSmFGSsR2kSi0Vpl76sCZXPMKkQs0SKtQ/edit?usp=sharing)

see my example: **beavers!** [html](http://rpubs.com/annakrystalli/200119) - [raw .Rmd](https://raw.githubusercontent.com/annakrystalli/ISBE_symposium/master/markdown/beavers.Rmd)
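
##

If you are unsure where to start, the steps above can be sketched in a single chunk. This is only one possible starting point, using the built-in `women` dataset as an arbitrary choice; it is not the beavers example linked above.

```{r}
library(knitr)

# Choose some data: `women` ships with R's datasets package.
data(women)

# Show some data in a table.
women_table <- kable(head(women), caption = "First rows of the women dataset")
women_table

# Plot some data.
plot(women$height, women$weight,
     xlab = "height (in)", ylab = "weight (lbs)")
```

The written part of the exercise goes in plain markdown above and below the chunk.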


